Abstract. We study the dynamics of the 'batch' minority game with market-impact
correction using generating functional techniques to carry out the
quenched disorder average. We find that the assumption of weak long-term 
memory, which one usually makes in order to calculate ergodic stationary states, 
breaks down when the persistent autocorrelation becomes larger than $c_c \simeq 0.772$.
We show that this condition, remarkably, coincides with the AT-line found in 
an earlier static calculation. This result suggests a new scenario for ergodicity 
breaking in disordered systems. 

The minority game§ (MG) models a market of speculators interacting through a
simple supply-and-demand mechanism [?]. One of the key behavioural assumptions of
the original model is that agents act as so-called price-takers, meaning that at every
stage of the game each of them only perceives the aggregate action of all agents, i.e.
the total bid. Recently, in [?], a generalization has been introduced in which agents
are able to estimate their own contribution to the total bid and use this additional
information to adjust their learning dynamics and optimize their performance. The
statics of this model has been tackled by spin-glass techniques in [?, ?] along the lines
of [?]. The system was found to approximately minimize a disordered Hamiltonian $H$
whose minima could be calculated with the replica method. It was shown that replica- 
symmetry breaking (RSB) can occur, implying the existence of multiple stationary 
states. 

In this Letter we adapt the dynamical method used in [?] to analyze the 'batch'
version of this model. Assuming time-translation invariance, finite integrated response
and weak long-term memory [?], we obtain exact results for the stationary state
which are in excellent agreement with computer experiments and with earlier static 
approaches. Moreover, we derive a condition for the continuous onset of memory, 
where the assumption of weak long-term memory is found to fail while time-translation 
invariance still holds. This appears to be different from the usual aging scenario in 
non-ergodic disordered systems. Remarkably, the memory-onset condition coincides 
with the AT line found in statics. 

§ See the web page www.unifr.ch/econophysics/minority for an extensive and commented overview
of the existing literature.

We begin by recalling the definition of the model. We consider $N$ agents labeled
by roman indices. At each iteration round $n$ all agents receive the same information
pattern $\mu(n)$ drawn at random with uniform probability from $\{1,\ldots,\alpha N\}$. Each
agent has at his disposal $S$ different strategies (labeled by $g = 1,\ldots,S$) to convert
the acquired information into a trading decision. Strategies are denoted by $\alpha N$-dimensional
vectors $\mathbf{a}_{ig} = \{a_{ig}^{\mu}\}_{\mu=1}^{\alpha N} \in \{-1,1\}^{\alpha N}$, where $a_{ig}^{\mu}$ is the trading action
(e.g. $+1$ for 'buy', $-1$ for 'sell') prescribed to agent $i$ by his $g$-th strategy given
receipt of information $\mu$. By assumption, each component $a_{ig}^{\mu}$ is selected randomly
and independently from $\{-1,1\}$ with uniform probabilities before the start of the
game, for all $i$, $g$ and $\mu$. This introduces quenched disorder into the model. Each
strategy of every agent is given an initial valuation $p_{ig}(0)$, which is updated at the end
of every round. At the start of round $n$, given $\mu(n)$, every agent selects the strategy
with the highest valuation, $g_i(n) = \arg\max_g p_{ig}(n)$, and subsequently makes a bid
according to the trading decision set by the selected strategy: $b_i(n) = a_{i g_i(n)}^{\mu(n)}$. The
total bid at round $n$ is defined as $A(n) = N^{-1/2}\sum_{i=1}^{N} b_i(n)$. Finally, for all $i$ and $g$
all payoffs are updated according to a reinforcement learning dynamics of the form

$$p_{ig}(n+1) = p_{ig}(n) - a_{ig}^{\mu(n)}\left[A(n) - \frac{\eta}{\sqrt{N}}\left(a_{i g_i(n)}^{\mu(n)} - a_{ig}^{\mu(n)}\right)\right] \qquad (1)$$

and agents move to the next round. The first term in square brackets embodies the 
minority rule, in that the valuation of a strategy is increased every time it predicts 
the correct minority action, independently of it having been actually used. The term 
proportional to $\eta$ adjusts the total bid for the possibility that agent $i$ is not using
strategy $g_i(n)$. For $\eta = 0$ one returns to the original MG, while for $\eta = 1$ the total
bid is completely adjusted.
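
As an illustration only, one on-line round of the game described above could be sketched in Python as follows. The sketch assumes that update (1) has the form $p_{ig}(n+1) = p_{ig}(n) - a_{ig}^{\mu(n)}[A(n) - (\eta/\sqrt{N})(b_i(n) - a_{ig}^{\mu(n)})]$. The function name `play_round`, the nested-list layout `a[i][g][mu]` and the tie-breaking rule (lowest strategy index wins) are hypothetical choices made for this example and are not specified in the text.

```python
import random

def play_round(a, p, eta, rng):
    """Play one on-line round; valuations p[i][g] are updated in place.

    a[i][g][mu] in {-1, +1} is the action of strategy g of agent i for pattern mu.
    """
    n_agents = len(a)
    n_patterns = len(a[0][0])              # alpha * N information patterns
    mu = rng.randrange(n_patterns)         # public information mu(n)
    # Each agent plays its highest-valued strategy (ties -> lowest index: an assumption).
    chosen = [max(range(len(p[i])), key=lambda g: p[i][g]) for i in range(n_agents)]
    bids = [a[i][chosen[i]][mu] for i in range(n_agents)]
    total_bid = sum(bids) / n_agents ** 0.5          # A(n) = N^{-1/2} sum_i b_i(n)
    for i in range(n_agents):
        for g in range(len(p[i])):
            # Minority reward, with the agent's own impact corrected by eta.
            impact = eta * (bids[i] - a[i][g][mu]) / n_agents ** 0.5
            p[i][g] -= a[i][g][mu] * (total_bid - impact)
    return mu, bids, total_bid

# Usage: a small random game with N = 11 agents, 5 patterns and S = 2 strategies.
rng = random.Random(0)
N, P, S = 11, 5, 2
a = [[[rng.choice((-1, 1)) for _ in range(P)] for _ in range(S)] for _ in range(N)]
p = [[0.0] * S for _ in range(N)]
mu, bids, A = play_round(a, p, eta=1.0, rng=rng)
```

For the strategy that was actually played the correction term vanishes, so $\eta$ only changes the valuation of the strategies that were not used.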

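We focus on the case $g = 1, 2$. Introducing the variables $y_i(n) = [p_{i1}(n) - p_{i2}(n)]/2$,
as well as the $\alpha N$-dimensional vectors $\boldsymbol{\omega}_i = (\mathbf{a}_{i1} + \mathbf{a}_{i2})/2$,
$\boldsymbol{\Omega} = N^{-1/2}\sum_{i=1}^{N}\boldsymbol{\omega}_i$ and $\boldsymbol{\xi}_i = (\mathbf{a}_{i1} - \mathbf{a}_{i2})/2$,
and defining $s_i(n) = \mathrm{sgn}[y_i(n)]$ one has

$$y_i(n+1) = y_i(n) - \xi_i^{\mu(n)}\left[\Omega^{\mu(n)} + \frac{1}{\sqrt{N}}\sum_{j=1}^{N}\xi_j^{\mu(n)} s_j(n)\right] + \frac{\eta}{\sqrt{N}}\left(\xi_i^{\mu(n)}\right)^2 s_i(n) \qquad (2)$$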

Following [?], we study in this paper a 'batch' version of the model, which is obtained
by averaging (2) over information patterns:

$$y_i(t+1) = y_i(t) - h_i - \sum_{j=1}^{N} J_{ij}\, s_j(t) + \eta\alpha\, s_i(t) + \theta_i(t) \qquad (3)$$
where $t$ is a re-scaled time, $h_i = (2/\sqrt{N})\,\boldsymbol{\xi}_i\cdot\boldsymbol{\Omega}$ and $J_{ij} = (2/N)\,\boldsymbol{\xi}_i\cdot\boldsymbol{\xi}_j$. The external
field $\theta_i(t)$ has been added for later use. In contrast to the more usual 'on-line' model
(2), where the $y_i$'s are updated after every iteration step, in the 'batch' case the
updates are made on the basis of the average effect of all possible choices of $\mu$. This
modified dynamics yields results for the stationary state which are quantitatively very
similar to those of the original model [?]. The theoretical advantage of the 'batch'
formulation is that it circumvents the difficulty of constructing a proper continuous
time limit. The numerical advantage is that one can simulate larger systems for a
longer time. Following [?], one derives the effective non-linear single-agent equation

$$y(t+1) = y(t) - \alpha\sum_{t'\le t}\left[(I+G)^{-1}\right]_{tt'} s(t') + \eta\alpha\, s(t) + \sqrt{\alpha}\, z(t) + \theta(t) \qquad (4)$$

where $s(t) = \mathrm{sgn}[y(t)]$ and $z(t)$ is a Gaussian noise with zero mean and temporal
correlations given by

$$\langle z(t)z(t')\rangle = \Xi_{tt'} = \left[(I+G)^{-1}(E+C)(I+G^{T})^{-1}\right]_{tt'} \qquad (5)$$
The matrices $C$ and $G$ appearing here are the noise-averaged single-agent correlation
and response functions for the process (4), with elements

$$C_{tt'} = \langle s(t)s(t')\rangle \qquad\text{and}\qquad G_{tt'} = \left\langle\frac{\partial s(t)}{\partial\theta(t')}\right\rangle \qquad (6)$$

respectively, while $I$ is the identity matrix and $E$ denotes the matrix with all entries
equal to one. The link between the Markovian multi-agent system (3) and the
non-Markovian single-agent process (4) is established by the fact that, for $N \to \infty$, $C_{tt'}$
and $G_{tt'}$ become identical to the disorder- and agent-averaged correlation and response
functions of (3):

$$C_{tt'} = \lim_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\left[s_i(t)s_i(t')\right]_{\rm dis} \qquad\text{and}\qquad G_{tt'} = \lim_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\left[\frac{\partial s_i(t)}{\partial\theta_i(t')}\right]_{\rm dis} \qquad (7)$$

Eqs. (4)-(6) describe the dynamics of the system exactly in the $N \to \infty$ limit. We
now move to the stationary states of (4) upon making the following assumptions:
Time-translation invariance (TTI): $\lim_{t\to\infty} C_{t+\tau,t} = C(\tau)$, $\lim_{t\to\infty} G_{t+\tau,t} = G(\tau)$
Finite integrated response (FIR): $\lim_{t\to\infty}\sum_{t'\le t} G_{tt'} = \chi < \infty$
Weak long-term memory (WLTM): $\lim_{t\to\infty} G_{tt'} = 0$ for all finite $t'$
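For the re-scaled quantity $\tilde y = \lim_{t\to\infty} y(t)/t$ one finds

$$\tilde y = -\frac{\alpha\,\tilde s}{1+\chi} + \alpha\eta\,\tilde s + \sqrt{\alpha}\,\tilde z + \theta \qquad (8)$$

where $\tilde s = \lim_{\tau\to\infty}\tau^{-1}\sum_{t\le\tau}\mathrm{sgn}[y(t)]$ and $\tilde z = \lim_{\tau\to\infty}\tau^{-1}\sum_{t\le\tau} z(t)$, while $\theta$ is a
static field. The variance of the zero-average Gaussian random variable $\tilde z$ can be
calculated from (5), yielding

$$\langle\tilde z^2\rangle = \lim_{\tau,\tau'\to\infty}\frac{1}{\tau\tau'}\sum_{t\le\tau}\sum_{t'\le\tau'}\Xi_{tt'} = \frac{1+c}{(1+\chi)^2} \qquad (9)$$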



with the persistent correlation $c = \langle\tilde s^2\rangle = \lim_{\tau\to\infty}\tau^{-1}\sum_{t\le\tau} C(t)$. The effective agent
is 'frozen' if $\tilde y \neq 0$, so that $\tilde s = \mathrm{sgn}(\tilde y)$ and he is always employing the same strategy.
Setting $\theta = 0$, this is easily seen to be the case if $|\tilde z| > \gamma$ with $\gamma = \sqrt{\alpha}\,[(1+\chi)^{-1} - \eta]$,
provided $\gamma > 0$. He is instead fickle when $\tilde y = 0$, i.e. $|\tilde z| < \gamma$, and in this case $\tilde s = \tilde z/\gamma$.
A self-consistent equation for $c$ can now be derived by separating the contribution of
the frozen agents from that of the fickle ones. Upon defining $\lambda = \gamma/\sqrt{\langle\tilde z^2\rangle}$ one finds



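$$c = \left\langle\Theta(|\tilde z|-\gamma)\right\rangle + \left\langle\Theta(\gamma-|\tilde z|)\,\frac{\tilde z^2}{\gamma^2}\right\rangle = 1 - \phi + \frac{1}{\lambda^2}\left[\phi - \lambda\sqrt{\frac{2}{\pi}}\,e^{-\lambda^2/2}\right] \qquad (10)$$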



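where $\Theta$ is the step function, $\phi = \mathrm{erf}(\lambda/\sqrt{2})$ is the fraction of fickle agents, and
$1-\phi$ is the fraction of frozen agents. For $\chi = \alpha^{-1/2}\langle\partial\tilde s/\partial\tilde z\rangle$ one obtains

$$\chi = \frac{1}{\sqrt{\alpha}}\left\langle\Theta(|\tilde z|-\gamma)\,2\delta(\sqrt{\alpha}\,\tilde z)\right\rangle + \frac{1}{\gamma\sqrt{\alpha}}\left\langle\Theta(\gamma-|\tilde z|)\right\rangle = \frac{\phi}{\gamma\sqrt{\alpha}} \qquad (11)$$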

Equations (9)-(11) form a closed set from which one can solve for $\phi$, $c$ and $\chi$ for any $\alpha$
and $\eta$. Results for $c$ are shown in Figure 1.
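
A minimal numerical sketch of how this closed set might be solved is given below. It rests on a rearrangement introduced only for this example: writing $u = 1/(1+\chi)$ and $r = \gamma(1+\chi)/\sqrt{\alpha}$, the definition of $\lambda$ together with (9) gives $r = \lambda\sqrt{(1+c)/\alpha}$, Eq. (11) in the form $\chi = \phi/(\gamma\sqrt{\alpha})$ gives $u = 1 - \phi/(\alpha r)$, and the definition of $\gamma$ gives $\eta = (1-r)\,u$, so that only $\lambda$ remains unknown. The closed form $c(\lambda) = 1 - \phi + [\phi - \lambda\sqrt{2/\pi}\,e^{-\lambda^2/2}]/\lambda^2$ is assumed for (10). The function names and the bisection bracket are hypothetical, and the bracket is assumed to contain exactly one sign change, which need not hold for all $\alpha$ and $\eta$.

```python
import math

def phi(lam):
    """Fraction of fickle agents, phi = erf(lambda / sqrt(2))."""
    return math.erf(lam / math.sqrt(2.0))

def c_of(lam):
    """Persistent correlation c(lambda): Gaussian average in Eq. (10)."""
    tail = lam * math.sqrt(2.0 / math.pi) * math.exp(-0.5 * lam * lam)
    return 1.0 - phi(lam) + (phi(lam) - tail) / lam ** 2

def reduced(lam, alpha):
    """Return r = gamma(1+chi)/sqrt(alpha) and u = 1/(1+chi) implied by lambda."""
    r = lam * math.sqrt((1.0 + c_of(lam)) / alpha)   # from lambda = gamma / sqrt(<z^2>)
    u = 1.0 - phi(lam) / (alpha * r)                 # from chi = phi / (gamma sqrt(alpha))
    return r, u

def solve_stationary(alpha, eta, lo=1e-3, hi=10.0, iters=200):
    """Bisection on lambda for eta = (1 - r) u; assumes one sign change in [lo, hi]."""
    def residual(lam):
        r, u = reduced(lam, alpha)
        return (1.0 - r) * u - eta
    f_lo = residual(lo)
    if f_lo * residual(hi) > 0.0:
        raise ValueError("no sign change in the bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    lam = 0.5 * (lo + hi)
    r, u = reduced(lam, alpha)
    return {"lam": lam, "c": c_of(lam), "phi": phi(lam), "chi": 1.0 / u - 1.0}

# Usage: stationary state at alpha = 1 without market-impact correction.
solution = solve_stationary(alpha=1.0, eta=0.0)
```

Any root returned this way should still be checked against the assumptions $\gamma > 0$ and $0 < u \le 1$ before it is interpreted as a stationary state.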

For negative $\eta$, one observes an excellent agreement between theory and
experiment for all values of $\alpha$, implying that none of our assumptions is ever violated.

When $\eta = 0$, we recover the results of [?], which match the simulations perfectly
for $\alpha$ larger than the critical value $\alpha_c \simeq 0.3374$. At this point the integrated response
$\chi$ diverges (FIR is violated) and a transition to a highly non-ergodic regime takes
place, where the stationary state depends on the initial conditions $y(0)$. Starting
with $y(0) \simeq 0$ leads to a high volatility state, while starting with $|y(0)| \gg 1$ leads to
relatively low volatility. The latter regime can be solved using the assumption that $\chi$
remains very large for all $\alpha < \alpha_c$. In fact, if $\chi \gg 1$ then $\gamma \simeq \sqrt{\alpha}/\chi$ so that $\phi = \alpha$,
which is equivalent to $\mathrm{erf}(\lambda/\sqrt{2}) = \alpha$. Solving this for $\lambda$ and inserting the resulting
value in (10), we obtain the top left branch of the $\eta = 0$ curve in Fig. 1, which is
again in excellent agreement with numerical results.
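
For readers who wish to explore the dependence on $y(0)$ numerically, a small pure-Python sketch of the batch map (3) with $\theta_i = 0$ is given below. It is not the code used for the figures: the function name `simulate_batch`, the system size, the convention $\mathrm{sgn}(0) = +1$, the random signs of the initial values and the choice to time-average over the second half of the run are all assumptions made for this example. The persistent correlation is estimated as $c \approx N^{-1}\sum_i \tilde s_i^2$.

```python
import random

def simulate_batch(n_agents=51, alpha=1.0, eta=0.0, y0=1e-3, steps=200, seed=0):
    """Iterate the batch map (3) with theta = 0 and estimate c = (1/N) sum_i <s_i>^2."""
    rng = random.Random(seed)
    n_pat = max(1, round(alpha * n_agents))          # number of patterns, alpha * N
    a1 = [[rng.choice((-1, 1)) for _ in range(n_pat)] for _ in range(n_agents)]
    a2 = [[rng.choice((-1, 1)) for _ in range(n_pat)] for _ in range(n_agents)]
    xi = [[(u - v) / 2 for u, v in zip(r1, r2)] for r1, r2 in zip(a1, a2)]
    om = [[(u + v) / 2 for u, v in zip(r1, r2)] for r1, r2 in zip(a1, a2)]
    root_n = n_agents ** 0.5
    big_omega = [sum(om[i][m] for i in range(n_agents)) / root_n for m in range(n_pat)]
    h = [2.0 / root_n * sum(x * w for x, w in zip(xi[i], big_omega))
         for i in range(n_agents)]
    y = [y0 * rng.choice((-1, 1)) for _ in range(n_agents)]
    s_sum = [0.0] * n_agents
    measured = 0
    for t in range(steps):
        s = [1 if v >= 0 else -1 for v in y]          # sgn(0) := +1 (assumption)
        if t >= steps // 2:                           # time-average over the second half
            measured += 1
            for i in range(n_agents):
                s_sum[i] += s[i]
        # sum_j J_ij s_j = (2/N) sum_mu xi_i^mu (sum_j xi_j^mu s_j)
        field = [sum(xi[j][m] * s[j] for j in range(n_agents)) for m in range(n_pat)]
        for i in range(n_agents):
            coupling = 2.0 / n_agents * sum(x * f for x, f in zip(xi[i], field))
            y[i] += -h[i] - coupling + eta * alpha * s[i]
    return sum((v / measured) ** 2 for v in s_sum) / n_agents

# Usage: compare a run started near y(0) = 0 with one started at large |y(0)|.
c_near_zero = simulate_batch(alpha=0.2, y0=1e-3)
c_far_away = simulate_batch(alpha=0.2, y0=10.0)
```

A single small sample like this is noisy; averaging over many disorder samples (different `seed` values) and using larger systems would be needed before comparing with the theoretical curves.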

For positive $\eta$, one sees that when $c > c_c \simeq 0.77$ our theoretical predictions
deviate from the experimental observations, whereas the agreement is perfect for
$c < c_c$. Finding no violation of FIR, we have to conclude that either TTI or WLTM
is violated. However, we have found no evidence of aging. Therefore we expect the
deviations to be related to the breakdown of WLTM only. To find the onset of memory,
we split $G_{tt'}$ into its TTI part and its non-TTI part:

$$\lim_{t\to\infty} G_{tt'} = G(t-t') + \hat G(t,t') \qquad (12)$$

During the initial stages of the game, small perturbations can cause some agents,
which would otherwise have remained fickle, to freeze and vice versa, thus creating
a persistent part $\hat G$ in the response function. As the agents freeze, their state (and
consequently their contribution to $\hat G$) becomes independent of $t$, so that we expect
$\lim_{t\to\infty}\hat G(t,t') = \hat G(t')$. After an initial equilibration period, for all frozen agents
the difference between the strategy valuations has become very large, so they are
virtually insensitive to perturbations. Hence we must assume that $\lim_{t'\to\infty}\hat G(t') = 0$.
The fickle agents, however, remain sensitive to small perturbations. The effects will
wear out over time (finite response) and are given by $G$.

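Assuming $\hat G$ is small, we expand the inverse of the full response kernel, $(I+G+\hat G)^{-1}$,
in powers of $\hat G$ up to first order, with $G$ now denoting the TTI part:

$$(I+G+\hat G)^{-1} = (I+G)^{-1} - \sum_{n=0}^{\infty}\sum_{m=0}^{n-1}(-G)^{m}\,\hat G\,(-G)^{n-m-1} + O(\hat G^2). \qquad (13)$$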

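Defining $\chi = \sum_t G(t)$ and $\hat\chi = \sum_t \hat G(t)$, one then finds asymptotically

$$\tilde y = -\alpha\left[\frac{1}{1+\chi} - \eta\right]\tilde s + \sqrt{\alpha}\,\tilde z + \alpha\sum_{n=0}^{\infty}\sum_{m=0}^{n-1}(-\chi)^{m}\sum_{t'}\hat G(t')\sum_{t''}\left[(-G)^{n-m-1}\right](t',t'')\,s(t'') \qquad (14)$$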

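Using the rectified linear function $f(x) = x$ for $|x| < 1$ and $\mathrm{sgn}(x)$ otherwise, we see
that if $1/(1+\chi) > \eta$ then

$$\tilde s = f\!\left(\frac{1}{\gamma}\left[\tilde z + \sqrt{\alpha}\sum_{n=0}^{\infty}\sum_{m=0}^{n-1}(-\chi)^{m}\sum_{t'}\hat G(t')\sum_{t''}\left[(-G)^{n-m-1}\right](t',t'')\,s(t'')\right]\right), \qquad \gamma = \sqrt{\alpha}\left[\frac{1}{1+\chi} - \eta\right] \qquad (15)$$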



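As before, we have $\chi = \alpha^{-1/2}\langle\partial\tilde s/\partial\tilde z\rangle$, whereas

$$\hat G(t) = \left\langle\frac{\partial\tilde s}{\partial\theta(t)}\right\rangle = \frac{\sqrt{\alpha}}{\gamma}\sum_{n=0}^{\infty}\sum_{m=0}^{n-1}(-\chi)^{m}\sum_{t'}\hat G(t')\sum_{t''}\left[(-G)^{n-m-1}\right](t',t'')\,G(t'',t) \qquad (16)$$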
Up to first order in $\hat G$ one finds $\hat\chi = \hat\chi\,\chi\sqrt{\alpha}/[\gamma(1+\chi)^2] + O(\hat G^2)$. Although $\hat\chi = 0$ is
always a solution of this equation, a bifurcation occurs when $\chi\sqrt{\alpha}/[\gamma(1+\chi)^2] = 1$,
which is equivalent to $\phi = \alpha[1-\eta(1+\chi)]^2$, and can be written in terms of $\lambda$ as

$$\lambda^2\,[1 + c(\lambda)] = \phi(\lambda) \qquad (17)$$

We call this line in the $(\alpha,\eta)$ plane the memory-onset (MO) line, see Figure 2. It
coincides remarkably with the AT-line (see Appendix), and implies that the bifurcation
occurs at $c_c \simeq 0.7722$ for $\eta > 0$. Above this value, WLTM can be broken, and indeed
one sees from Figure 1 that numerical results deviate from our theoretical predictions
for $c > c_c$. To give further evidence of memory, we have analyzed the time evolution
of two identical copies $a$ and $b$ of the system, starting from slightly different initial
conditions. We plotted in Figure 3 the distance $d$ of the stationary states, given by
$d = (1/N)\sum_i(\tilde s_i^{a} - \tilde s_i^{b})^2$, where $\tilde s_i^{m}$ is the long-time average of $\mathrm{sgn}(y_i^{m})$ ($m = a, b$), versus
the persistent autocorrelation of copy $a$, $c^{a}$. As $c^{a}$ approaches $c_c$, the two copies end
up in different stationary states, proving that they remember initial conditions‖. At
the same time, if a perturbation is applied much later during the run the copies end
up in the same stationary state, indicating that indeed $\hat G(t') \to 0$ as $t' \to \infty$.
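
The critical value $c_c$ can be checked numerically from the memory-onset condition $\lambda^2[1+c(\lambda)] = \phi(\lambda)$ of Eq. (17). The sketch below assumes $\phi(\lambda) = \mathrm{erf}(\lambda/\sqrt{2})$ and the Gaussian closed form $c(\lambda) = 1 - \phi + [\phi - \lambda\sqrt{2/\pi}\,e^{-\lambda^2/2}]/\lambda^2$ for Eq. (10). The helper names and the bisection bracket $[0.1, 2]$, chosen to exclude the trivial root at $\lambda \to 0$, are hypothetical choices made for this example.

```python
import math

def phi(lam):
    """Fraction of fickle agents, phi = erf(lambda / sqrt(2))."""
    return math.erf(lam / math.sqrt(2.0))

def c_of(lam):
    """Persistent correlation c(lambda), Gaussian closed form assumed for Eq. (10)."""
    tail = lam * math.sqrt(2.0 / math.pi) * math.exp(-0.5 * lam * lam)
    return 1.0 - phi(lam) + (phi(lam) - tail) / lam ** 2

def mo_residual(lam):
    """Left-hand side minus right-hand side of the MO condition (17)."""
    return lam ** 2 * (1.0 + c_of(lam)) - phi(lam)

def bisect(f, lo, hi, iters=200):
    """Plain bisection; requires a sign change of f between lo and hi."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("no sign change in the bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

lam_c = bisect(mo_residual, 0.1, 2.0)   # bracket excludes the trivial root lambda -> 0
c_c = c_of(lam_c)
print(lam_c, c_c, phi(lam_c))
```

Because (17) involves $\lambda$ only, the resulting $c_c$ does not depend on $\alpha$ or $\eta$, and it should land close to the value 0.7722 quoted above. Setting $\eta = 0$ in $\phi = \alpha[1-\eta(1+\chi)]^2$ formally gives $\alpha = \phi(\lambda_c)$, which should come out close to the $\alpha_c \simeq 0.3374$ quoted for the uncorrected game.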

Summarizing, we have shown that in this model the usual connection between 
broken ergodicity and broken TTI (aging), as seen for instance in mean-field spin
glasses, does not occur. In contrast, we derive from the dynamics a condition for
breakdown of WLTM and continuous onset of memory within the TTI regime, which 
is found to be equivalent to the AT-line found in the static approach. This remarkable 
deviation from the well-known RSB / aging picture is possibly due to the fact that the 
microscopic dynamics of our model does not satisfy detailed balance. 

We gratefully acknowledge support from and useful discussions with A C C Coolen 
and M Marsili, and with S Franz. We also thank SISSA and King's College London for 
reciprocal hospitality. This work originated at the International Seminar on Statistical 
Mechanics of Information Processing in Cooperative Systems (Dresden, March 2001). 
‖ The slight bump that occurs before $c_c$ is likely due to the fact that in our simulation the perturbation
cannot be infinitesimal, but is at least $1/N$.

Appendix

While investigating the possible relation between our MO line and replica-symmetry
breaking, it became apparent that in the very final step of the AT-line calculation
in [?] a small error has occurred. It was found that the replica-symmetric solution
becomes unstable when

$$\lim_{\beta\to\infty}\beta^2\left\langle\left[\langle s^2\rangle - \langle s\rangle^2\right]^2\right\rangle_z = (1+\chi)^2 \qquad (18)$$

where $\langle f(s)\rangle = Z_\beta(z)^{-1}\int_{-1}^{1} f(s)\,e^{-\beta V_z(s)}\,ds$ and $Z_\beta(z) = \int_{-1}^{1} e^{-\beta V_z(s)}\,ds$, with
$V_z(s) = \frac{1}{2}\gamma s^2 - zs$. The brackets $\langle\cdots\rangle_z$ denote a Gaussian average over $z$ having zero mean
and variance $\langle z^2\rangle_z = (1+q)/(1+\chi)^2$, $q$ being the overlap between two different
replicas (off-diagonal overlap matrix element). We have absorbed a spurious factor
$\sqrt{\alpha}$ in $\beta$. If we now define $F(z) = -\lim_{\beta\to\infty}\beta^{-1}\log Z_\beta(z)$, the AT line can be
written as $\langle F''(z)^2\rangle_z = (1+\chi)^2$. By Laplace's method we find $F(z) = V_z(s_0)$, $s_0$
being the minimum of $V_z$ in $[-1,1]$. For $|z/\gamma| < 1$, $s_0$ lies inside this interval and
$V_z(s_0) = -z^2/(2\gamma)$, while for $|z/\gamma| > 1$, $s_0$ is on the border and $V_z(s_0) = \gamma/2 - |z|$. This
gives second derivatives that are $-\gamma^{-1}$ and $0$, respectively. The AT-line is therefore
given by $\langle\gamma^{-2}\,\Theta(1-|z/\gamma|)\rangle_z + \langle 0\cdot\Theta(|z/\gamma|-1)\rangle_z = (1+\chi)^2$. Recognizing the
non-vanishing term on the l.h.s. as $\gamma^{-2}$ times the fraction $\phi$ of fickle agents, we find

$$\alpha\left[1-\eta(1+\chi)\right]^2 = \phi \qquad (19)$$

similar to the result of [?] where in place of $\phi$ a 1 was reported. Written in terms of
$\lambda$, the AT line is identical to the MO line (17). We learned that the AT-line can also
be derived from the dynamical stability of Eqs. (?) [?].
Figure 1. The persistent correlation $c$ as a function of $\alpha$ for different values
of $\eta$. Lines represent theoretical predictions. Solid lines: from bottom to top,
$\eta = -1, -0.5, -0.25, 0, 0.25, 0.5, 0.7$. Dashed line: $y(0) \gg 0$, $\eta = 0$. Dot-dashed
line: $\eta = 0^{-}$. Diamonds correspond to computer simulations with $\alpha N^2 = 10{,}000$,
run for 500 time steps and averaged over 50 disorder samples.
Figure 2. The solid line represents the MO (=AT) line. The dashed line
corresponds to the AT line reported in [?].
Figure 3. Distance between the stationary states of two identical copies of the
system (see text) as a function of their persistent autocorrelation. Simulations are
for various levels of $\eta$ with $N = 450$, averaged over 100 samples. Open markers
correspond to a perturbation at $t = 0$ (different marker shapes for $\alpha = 1, 2, 4$). Closed markers
correspond to a perturbation at $t = 500$ (♦ for $\alpha = 2$). All simulations are run up
to 500 steps after the perturbation occurs. Time averages are over the last 300
steps.



